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Abstract 

The configuration interaction (CI) is a versatile wavefunction theory for interacting fermions but 
it involves an extremely long CI series. Using a symmetric tensor decomposition (STD) method, 
we convert the CI series into a compact and numerically tractable form. The converted series 
encompasses the Hartree-Fock state in the first term and rapidly converges to the full-CI state, 
as numerically tested using small molecules. Provided that the length of the STD-CI series grows 
only moderately with the increasing complexity of the system, the new method will serve as one 
of the alternative variational methods to achieve full-CI with enhanced practicability. 



An accurate description of the ground-state wavefunction of an interacting Fermion sys- 
tem is one of the central goals of modern science. The most straightforward and versatile 
approach to describing this wavefunction is the configuration interaction (CI), but its nu- 
merical application is greatly limited by the fact that the full-CI series consists of mCn 
Slater determinants (SDs) when describing an iV-electron system using M basis functions. 
To truncate this extremely long CI series without compromising on chemical accuracy, many 
methods have been developed, such as the multi-reference CI, which uses a part of the SDs 
derived from a few of the most important ones, or the complete active space (CAS) CI which 
uses all SDs generated from a selected set of orbitals [1]. Even so, the application has been 
hampered by the slow convergency of the CI series. 

In this context, the many-body perturbation approaches to treat all SDs have attracted 
attention; these approaches include the coupled cluster (CC) theory [2] [3] which is used to 
represent the wavefunction in terms of an SD (or a few SDs) applied with the exponential 
of an excitation operator. The CC theory has proven accurate for a number of molecules, 
although it occasionally provides qualitatively incorrect potential surfaces (J). The density 
matrix renormalization group (DMRG) method 5] has also attracted attention as a vari- 
ational method within the space of the matrix product state 6|. It has been extensively 
applied to correlated electron systems 7]js]; however, this method was originally formulated 
only for one-dimensional systems and its extension to three-dimensional systems is not very 
straightforward. 

Recent tensor analyses have shown that, despite the large number, the CI coefficients 
may be described by a tractable number of variational parameters. For example, the full-CI 
results of some molecules were accurately reproduced by the complete-graph tensor network 
CGTN) state containing ~ M 2 variational parameters [s][loj]. Tensor decomposition (TD) 



ll| methods such as the Tucker decomposition 12| and the canonical decomposition (CAN- 



DECOMP) /parallel factor decomposition (PARAFAC), abbreviated as CP, [13J [14j have 
also been applied to molecules. These methods were used to analyze the double excitation 
tensor T2 originating from the electron-electron interaction 15] 16] . The results showed that 
the 72 tensor of rank 4, consisting of ~ M 4 terms, can be described by ~ MK parameters, 
where K denotes the length of the tensor decomposition [if]]. The TD method was also sug- 
gested as being effective in greatly reducing the variational parameters required for full-CI 
15]. 
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In this context, we formulate a practical scheme to perform full-CI level calculation using 
a TD method. In this study, we describe the CI coefficients as a product of a symmetric 
tensor and the permutation tensor, and following the CP procedure we expand the former 
into K symmetric Kronecker product states, which are composed of vectors of dimension M. 
Subsequently, we calculate the second-order density matrix consisting of ~ K 2 M A elements 



using the Vieta's formula 18[ thereby performing ~ M 2 operations for each element. This 
allows us to perform the total energy calculation variationally using ~ K 2 M 6 operations. 
Our test calculations for the potential surface of simple diatomic molecules and for a Hub- 
bard cluster model with different parameters show that with increasing K , the total energy 
rapidly converges to the full-CI result. This shows that our symmetric tensor decomposition 
CI (STD-CI) scheme will greatly extend the applicability of the full-CI level calculation, 
provided that K increases only moderately with N, M, or the complexity of the electron 
correlation. In the rest of this paper, we provide the details of STD-CI. 

We begin by describing the Cl-series representation of the many-body wavefunction 

M 

if? (xi-- -X N ) = ^ Aii-iN^h M - ' ' V'ijv ( x n) , (1) 
h---i N =l 

where x% • • -x^ represent the space and spin coordinates of the electrons and ipi k s are the 
orthonormal orbitals which are represented as a linear combination of orthonormalized basis 
functions as 

3 

The antisymmetric tensor Ai v ..i N can be described as the product of a symmetric tensor 
(Si 1 ...i N ) of rank iV and dimension M and a product of iV (N — 1) /2 permutation tensors 
(ejj-s) of rank 2 as 

Next, Si 1 ...i N is decomposed into a minimal linear combination of symmetric Kronecker 



product states using vectors of dimension M, cj 



K 



C; , as 



Sil-iN - ^'^i " '°iN- ( 3 ) 
3=1 
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This symmetric tensor decomposition (STD) is a symmetric version of CP, which is also a 
special case of the symmetric Tucker decomposition 

Si v ..i N — S ji---3N U i 1 ' ' ' U i N (^) 

jl—JN 

in that the transformed tensor Sj 1 j 2 ...j N is the superdiagonal Xj in CP. The total energy is 
optimized by varying the vectors and the unitary matrix Uij, so that no approximation is 
made in our STD-CI method apart from the truncation of the series at K. It is noteworthy 
that each term in the STD series contains all the SDs generated from the orbitals ipi, although 
the degrees of freedom are only MK + M (M + 1) /2 as a whole, thereby indicating that 
we are treating the entangled states and that the degree of entanglement is reduced with 
increasing K. It can be shown that the Hartree-Fock (HF) approximation corresponds to 
taking K = 1 and c]^ +1 = ■ • • = c\j — 0; therefore, the approximation with K = 1 is already 
a natural extension of the HF approximation. When treating a weakly correlated system, 
an HF-like solution is obtained and on the other hand, when treating a strongly correlated 
system, an orbital-ordered solution is obtained, provided that a sufficiently large value of K 
is considered. In this manner, we can bridge the HF solution with the fully correlated state 
by increasing the value of K. The STD-CI will be exact when K =m Cjy- 

Our numerical procedure begins by constructing the second-order density matrix (DM), 
which has the form 

M 

72 (XiX 2 , X3X4) = T hi2i3iM 1 Oi) C (x 2 ) (x 3 ) ip u (x 4 ) . 

Using d2J) and (J2D, the DM coefficient can be rewritten as 

K 
ij=l 

where I for each set of indices {iji^isu} is expressed, using = ^k^k e iiki e i2h e nh e iih-i as 

M 

1= a k 3 ■ ■ ■ a-k N {tk 3 -k N ) 2 ■ (5) 

k 3 ---k N =l 

Based on the fact that the permutation tensor squared is equal to 1 when all the indices 
are different and otherwise, it can be shown using Vieta's formula that the value of / 
is equal to the M — (N — 2)-th order coefficients of the polynomial (N — 2)\f M (t) with 



f M (t) = (t + ai) ■ ■ ■ (t + au) [19]. The coefficient can be easily obtained by using a list 
manipulation, where the coefficients of f p (t) with < p < M are described by a row-vector 
of dimension p as f p = (f P fl, f Pt i, ■ ■ ■ , f P , P -i) and are applied with the iterative equation, 
/ = a p (/__!, 0) + (0, considering f = l. oc M 2 operations are required to obtain 

the coefficient of Therefore, the total number of operations needed to obtain all J's 

is oc K 2 M 6 . In practical coding, one may use the fact that cii k = when is equal to one 
of the four indices {^2^4} to achieve further efficiency. 

Subsequently the parameters c^ fc , A- 7 , and Uy are varied to minimize the total energy 

Etot ^ ^«li2«3*4-^«3i4n«2 / S ^11121112 With 

^!i 2 i3*4 = / dxidx 2 ^* 1 (xi) ip* 2 (x 2 



. 1 1„ 2 , A iV(iV + l) 1 



2 1 ^ v i7 y 2 |ri-r 2 | 

x ^ 3 A 4 M , (6) 



where v ext denotes the external potential. In the variation, we require derivatives of I 1 ? 
with respect to a,i k for those i%. not in {i^isi^}. To obtain the derivatives, we need to 
differentiate /at(£) by <2j fc and obtain its M — (N — l)-th coefficient. When this is done 
simply using the list manipulation, oc K 2 M 7 operations are required for each if,; however, 
the number of operations can be reduced when using Jm (t) / (t + for the differentiation. 
When the series (t+a^)- 1 = J2m=o a i" 1 ^ ( _t ) m is multi P lied with f*t (t), the M-(iV-l)-th 
coefficient can be obtained as 

Af-(JV-l) 

E, \S-1-M+(N-1) r 

s=0 

thereby requiring oc M operations for each k. Therefore, ~ K 2 M 6 operations are required 
to obtain all the derivatives of ^j 2 i 3 j 4 - By applying the same technique to the expression 
/ (t) I (t + ai) (t + a,j), the second derivatives are similarly obtained with oc K 2 M 6 opera- 
tions. It should be noted that the calculation of the derivatives is the rate-determining step 
in our calculation. 

In a manner similar to HF [2(3], {2]]], STD-CI can be applied to a crystalline solid by taking 
a linear combination of the atomic orbitals \% as 

<Mr) = 5> ife ^ (r-R T ), 

T 

where k and R T denote the reciprocal vector in the Brillouin zone and the nuclear coordinate, 
respectively. Thus, M should be read as the number of k values multiplied by the number 
of basis functions. 
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To assess the efficiency of STD-CI, we investigate how many terms in Eq.([3]) are required 
to achieve convergence in E tot . This investigation is carried out for simple diatomic molecules 
(H 2 , He2, and LiH) and a four-site Hubbard model. Relativistic effects are neglected and 
only the spin unpolarized state is calculated by using the same number of orbitals with 
an a and /3 spin. In testing the convergence, the calculated results are compared with 
the full CI calculation performed using the same basis functions. In our calculations, the 
Newton-Raphson method is used to variationally determine the parameters. 

H2 is the simplest molecule where the molecular orbital picture, valid near the equilibrium 
bond length, is switched to the Heitler-London picture, as the interatomic distance increases 
to infinity. The calculation with the STO-3G basis set shows that K = 1 reproduces the full 
CI potential curve within an error of 0.01 Ha error, while the error is less than 0.01 mHa 
when K = 2 (Fig. [1]). The molecule He2 is weakly bound the dispersion forces and the test 
is more stringent in this case. The calculation with the 6-31 1G basis set shows that K = 3 is 
sufficient to reproduce the full CI result within an error of 0.01 mHa, while K = 2 is already 
sufficient to obtain the binding energy within the same accuracy although the absolute value 
of E to t is always larger by 0.1 mHa (Fig. |2]). The binding energy is about three times larger 



than the accurate quantum chemical calculation [22] and the experimental results 23|, which 
is presumably due to the insufficient number of basis functions; obtaining an accurate value 
of the binding energy is beyond the scope of our comparative study, and this must be the 
consideration of future studies. LiH is a typical hetero-nuclear diatomic molecule. The 
4-31G calculation for this case shows that K = 1 nearly sufficiently reproduces the full CI 
result while the HF calculation significantly underestimates the binding energy (Fig. [3]). The 
final test is the application of our idea to the four-site Hubbard model in the tetrahedron 
structure under the half-filled condition. As the Hubbard U over the transfer t increases, 
larger K values are required; however, K = 6 is found sufficient even in the large U/t limit 

(Fig. UK . 

The computational time theoretically scales as K 2 M 6 . We tested the time scaling with 
our numerical code to find that CPU time indeed scales as K^M on average (Figs. 
5|6|) . Because the operations involved in the calculation can be performed independently, 
this method is suitable for massively parallel computers. 

We formulated a practical scheme to perform the full-CI level calculation using the STD 
method. In the STD-CI method, we expand the CI coefficients as the product of a sym- 
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FIG. 1: Potential curve of H2 molecules for full-CI (solid line), STD-CI with K = 1 (broken line 
with cross) and K = 2 (broken line with asterisk), and HF (dotted line). 
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FIG. 2: Potential curve of He2 molecules for full-CI (solid line) and STD-CI with K = 2 (broken 
line with cross) and K = 3 (broken line with asterisk). 



metric tensor and the permutation tensor, and we further expand the symmetric tensor into 
Kronecker product states composed of vectors of dimension M. By varying the vectors and 
the unitary transformation matrix Uij, the total energy is minimized using the second-order 
density matrix technique. The STD-CI method, which involves taking the length of the 
series K as the only input parameter, allows us to perform a full-CI level calculation rig- 
orously using KM + M (M + 1) /2 variational parameters and ~ K 2 M 6 operations. By 
applying the scheme to the potential curve of small diatomic molecules such as H2, He2, 
and LiH, and the four-site Hubbard model for various parameters, we found that a very 
small K value is required to reproduce the full-CI results within milli-Hartree accuracy. If 
K increases moderately with N, M, or the degree of correlation, the scheme will greatly 
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FIG. 3: Potential curve of LiH molecules for full-CI (solid line), STD-CI with K = 1 (broken line 
with cross) and K = 2 (broken line with asterisk), and HF (dotted line). 
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FIG. 4: Error in Ej t for the four-site Hubbard model for various parameters U/t = 1 (solid line), 
U/t = 100 (dotted line with cross), and U/t = 10000 (dotted line with asterisk). 



extend the applicability of the full-CI level calculation. We believe that application of the 
scheme to a crystalline solid and the use of the scheme as a building block of the fragment 
molecular orbital (FMO) scheme can be of high significance 24]. 
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FIG. 5: Total CPU time (T) versus K. The solid line indicates result after measurement for one 
iteration step and broke line indicates that after fitting to T = aK b . b is obtained as 1.97. 
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FIG. 6: Total CPU time (T) versus M. The solid line indicates result after measurement for one 
iteration step and the broken line indicates that after fitting to T = aM b . b is 5.97 after the fitting. 
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